Compact and Fast Algorithms for Regular Expression Search
نویسنده
چکیده
This paper describes an improvement of the brute force determinization algorithm in the case of homogeneous NFAs, as well as its application to pattern matching. Brute force determinization with limited memory may provide a partially determinized automaton, but its bounded complexity makes it be a fail-safe procedure contrary to the classical subset construction. We investigate the particular case of pattern matching based on Glushkov NFAs, which involves the determinization of an almost homogeneous automaton. Actually, our algorithm is inspired both by recent results of Champarnaud concerning the subset automaton of a homogeneous NFA and by the algorithm recently designed by Navarro and Raffinot to implement the brute force determinization of the Glushkov NFA of a regular pattern. We obtain an algorithm with an average exponentially smaller memory requirement for a given level of determinization, which, considering a bounded memory, implies a quadratically smaller parsing time.
منابع مشابه
SECURING INTERPRETABILITY OF FUZZY MODELS FOR MODELING NONLINEAR MIMO SYSTEMS USING A HYBRID OF EVOLUTIONARY ALGORITHMS
In this study, a Multi-Objective Genetic Algorithm (MOGA) is utilized to extract interpretable and compact fuzzy rule bases for modeling nonlinear Multi-input Multi-output (MIMO) systems. In the process of non- linear system identi cation, structure selection, parameter estimation, model performance and model validation are important objectives. Furthermore, se- curing low-level and high-level ...
متن کاملA Differential Evolution and Spatial Distribution based Local Search for Training Fuzzy Wavelet Neural Network
Abstract Many parameter-tuning algorithms have been proposed for training Fuzzy Wavelet Neural Networks (FWNNs). Absence of appropriate structure, convergence to local optima and low speed in learning algorithms are deficiencies of FWNNs in previous studies. In this paper, a Memetic Algorithm (MA) is introduced to train FWNN for addressing aforementioned learning lacks. Differential Evolution...
متن کاملFast and compact regular expression matching
We study 4 problems in string matching, namely, regular expression matching, approximate regular expression matching, string edit distance, and subsequence indexing, on a standard word RAM model of computation that allows logarithmic-sized words to be manipulated in constant time. We show how to improve the space and/or remove a dependency on the alphabet size for each problem using either an i...
متن کاملFast Motif Search in Protein Sequence Databases
Regular expression pattern matching is widely used in computational biology. Searching through a database of sequences for a motif (a simple regular expression), or its variations is an important interactive process which requires fast motif-matching algorithms. In this paper, we explore and evaluate various representations of the database of sequences using suffix trees for two types of query ...
متن کاملPractical Private Regular Expression Matching
Regular expressions are a frequently used tool to search in large texts. They provide the ability to compare against a structured pattern that can match many text strings and are common to many applications, even programming languages. This paper extends the problem to the private two-party setting where one party has the text string and the other party has the regular expression. The privacy c...
متن کامل